Improving Chunking by Means of Lexical-Contextual Information in Statistical Language Models
نویسندگان
چکیده
In this work, we present a stochastic approach to shallow parsing. Most of the current approaches to shallow parsing have a common characteristic: they take the sequence of lexical tags proposed by a POS tagger as input for the chunking process. Our system produces tagging and chunking in a single process using an Integrated Language Model (ILM) formalized as Markov Models. This model integrates several knowledge sources: lexical probabilities, a contextual Language Model (LM) for every chunk, and a contextual LM for the sentences. We have extended the ILM by adding lexical information to the contextual LMs. We have applied this approach to the CoNLL-2000 shared task improving the performance of tile chunker.
منابع مشابه
Etymology, Contextual Pragmatic Clues, and Lexical Knowledge in L2 Idioms Learning
To investigate the effects of etymological elaboration, contextual pragmatic clues, and lexical knowledge on L2 idioms comprehension and production, 60 male intermediate level EFL students in three groups were selected. Each group was randomly assigned to one treatment condition. Group one participants were presented with the etymological explanation of idioms. In group two, the same idioms wer...
متن کاملEnriching machine-mediated speech-to-speech translation using contextual information
Conventional approaches to speech-to-speech (S2S) translation typically ignore key contextual information such as prosody, emphasis, discourse state in the translation process. Capturing and exploiting such contextual information is especially important in machine-mediated S2S translation as it can serve as a complementary knowledge source that can potentially aid the end users in improved unde...
متن کاملStatistical Machine Translation Using Coercive Two-Level Syntactic Transduction
We define, implement and evaluate a novel model for statistical machine translation, which is based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. It is able to model long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, su...
متن کاملA Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks
The rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the rumor language can be helpful in identifying it. The previous research has focused more on the contextual information to reply tweets and less on the content features of the original rumor to address the rumor detection problem. Most of the studies have been in...
متن کاملIntegrating Source-Language Context into Log-Linear Models of Statistical Machine Translation
The translation features typically used in state-of-the-art statistical machine translation (SMT) model dependencies between the source and target phrases, but not among the phrases in the source language themselves. A swathe of research has demonstrated that integrating source context modelling directly into log-linear phrasebased SMT (PB-SMT) and hierarchical PB-SMT (HPB-SMT), and can positiv...
متن کامل